在本文中,我们旨在改善干扰限制的无线网络中超级可靠性和低延迟通信(URLLC)的服务质量(QoS)。为了在通道连贯性时间内获得时间多样性,我们首先提出了一个随机重复方案,该方案随机将干扰能力随机。然后,我们优化了每个数据包的保留插槽数量和重复数量,以最大程度地减少QoS违规概率,该概率定义为无法实现URLLC的用户百分比。我们构建了一个级联的随机边缘图神经网络(REGNN),以表示重复方案并开发一种无模型的无监督学习方法来训练它。我们在对称场景中使用随机几何形状分析了QoS违规概率,并应用基于模型的详尽搜索(ES)方法来找到最佳解决方案。仿真结果表明,在对称方案中,通过模型学习方法和基于模型的ES方法实现的QoS违规概率几乎相同。在更一般的情况下,级联的Regnn在具有不同尺度,网络拓扑,细胞密度和频率重复使用因子的无线网络中很好地概括了。在模型不匹配的情况下,它的表现优于基于模型的ES方法。
translated by 谷歌翻译
在本文中,我们研究了启用高速雾无线电访问网络(F-RAN)中的内容受欢迎程度预测问题。为了以高准确性和低复杂性预测内容的流行,我们提出了基于高斯流程的回归器,以模拟内容请求模式。首先,我们提出的模型捕获了内容特征和受欢迎程度之间的关系。然后,我们利用贝叶斯学习来训练模型参数,这对于过度拟合非常可靠。但是,贝叶斯方法通常无法找到后验分布的闭合形式表达。为了解决此问题,我们采用随机方差降低梯度哈密顿蒙特卡洛(SVRG-HMC)方法来近似后验分布。为了利用其他FOG接入点(F-AP)的计算资源并减少开销的通信,我们提出了一个量化的联合学习(FL)框架与贝叶斯学习相结合。量化的联合贝叶斯学习框架允许每个F-AP在量化和编码后将梯度发送到云服务器。它可以有效地实现预测准确性和通信间接费用之间的权衡。仿真结果表明,我们提出的政策的绩效优于现有政策。
translated by 谷歌翻译
在本文中,研究了FOG无线电访问网络(F-RAN)中的内容流行度预测问题。基于聚集的联合学习,我们提出了一种新颖的移动性知名度预测策略,该政策将内容受欢迎程度整合在本地用户和移动用户方面。对于本地用户,通过学习本地用户和内容的隐藏表示形式来预测内容的普及。本地用户和内容的初始功能是通过将邻居信息与自我信息结合在一起来生成的。然后,引入了双通道神经网络(DCNN)模型,以通过从初始功能中产生深层特征来学习隐藏表示形式。对于移动用户,通过用户偏好学习预测内容流行。为了区分内容受欢迎程度的区域变化,采用了聚类联合学习(CFL),这使具有相似区域类型的雾接入点(F-APS)彼此受益,并为每个F-AP提供更专业的DCNN模型。仿真结果表明,我们提出的政策对传统政策实现了重大的绩效提高。
translated by 谷歌翻译
FOG无线电访问网络(F-RAN)是一项有前途的技术,用户移动设备(MDS)可以将计算任务卸载到附近的FOG接入点(F-APS)。由于F-APS的资源有限,因此设计有效的任务卸载方案很重要。在本文中,通过考虑随时间变化的网络环境,制定了F-RAN中的动态计算卸载和资源分配问题,以最大程度地减少MD的任务执行延迟和能源消耗。为了解决该问题,提出了基于联合的深入强化学习(DRL)算法,其中深层确定性策略梯度(DDPG)算法在每个F-AP中执行计算卸载和资源分配。利用联合学习来培训DDPG代理,以降低培训过程的计算复杂性并保护用户隐私。仿真结果表明,与其他现有策略相比,提议的联合DDPG算法可以更快地实现MDS更快的任务执行延迟和能源消耗。
translated by 谷歌翻译
Masked image modeling (MIM) performs strongly in pre-training large vision Transformers (ViTs). However, small models that are critical for real-world applications cannot or only marginally benefit from this pre-training approach. In this paper, we explore distillation techniques to transfer the success of large MIM-based pre-trained models to smaller ones. We systematically study different options in the distillation framework, including distilling targets, losses, input, network regularization, sequential distillation, etc, revealing that: 1) Distilling token relations is more effective than CLS token- and feature-based distillation; 2) An intermediate layer of the teacher network as target perform better than that using the last layer when the depth of the student mismatches that of the teacher; 3) Weak regularization is preferred; etc. With these findings, we achieve significant fine-tuning accuracy improvements over the scratch MIM pre-training on ImageNet-1K classification, using all the ViT-Tiny, ViT-Small, and ViT-base models, with +4.2%/+2.4%/+1.4% gains, respectively. Our TinyMIM model of base size achieves 52.2 mIoU in AE20K semantic segmentation, which is +4.1 higher than the MAE baseline. Our TinyMIM model of tiny size achieves 79.6% top-1 accuracy on ImageNet-1K image classification, which sets a new record for small vision models of the same size and computation budget. This strong performance suggests an alternative way for developing small vision Transformer models, that is, by exploring better training methods rather than introducing inductive biases into architectures as in most previous works. Code is available at https://github.com/OliverRensu/TinyMIM.
translated by 谷歌翻译
Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.
translated by 谷歌翻译
Benefiting from the intrinsic supervision information exploitation capability, contrastive learning has achieved promising performance in the field of deep graph clustering recently. However, we observe that two drawbacks of the positive and negative sample construction mechanisms limit the performance of existing algorithms from further improvement. 1) The quality of positive samples heavily depends on the carefully designed data augmentations, while inappropriate data augmentations would easily lead to the semantic drift and indiscriminative positive samples. 2) The constructed negative samples are not reliable for ignoring important clustering information. To solve these problems, we propose a Cluster-guided Contrastive deep Graph Clustering network (CCGC) by mining the intrinsic supervision information in the high-confidence clustering results. Specifically, instead of conducting complex node or edge perturbation, we construct two views of the graph by designing special Siamese encoders whose weights are not shared between the sibling sub-networks. Then, guided by the high-confidence clustering information, we carefully select and construct the positive samples from the same high-confidence cluster in two views. Moreover, to construct semantic meaningful negative sample pairs, we regard the centers of different high-confidence clusters as negative samples, thus improving the discriminative capability and reliability of the constructed sample pairs. Lastly, we design an objective function to pull close the samples from the same cluster while pushing away those from other clusters by maximizing and minimizing the cross-view cosine similarity between positive and negative samples. Extensive experimental results on six datasets demonstrate the effectiveness of CCGC compared with the existing state-of-the-art algorithms.
translated by 谷歌翻译
As one of the prevalent methods to achieve automation systems, Imitation Learning (IL) presents a promising performance in a wide range of domains. However, despite the considerable improvement in policy performance, the corresponding research on the explainability of IL models is still limited. Inspired by the recent approaches in explainable artificial intelligence methods, we proposed a model-agnostic explaining framework for IL models called R2RISE. R2RISE aims to explain the overall policy performance with respect to the frames in demonstrations. It iteratively retrains the black-box IL model from the randomized masked demonstrations and uses the conventional evaluation outcome environment returns as the coefficient to build an importance map. We also conducted experiments to investigate three major questions concerning frames' importance equality, the effectiveness of the importance map, and connections between importance maps from different IL models. The result shows that R2RISE successfully distinguishes important frames from the demonstrations.
translated by 谷歌翻译
Compressed videos often exhibit visually annoying artifacts, known as Perceivable Encoding Artifacts (PEAs), which dramatically degrade video visual quality. Subjective and objective measures capable of identifying and quantifying various types of PEAs are critical in improving visual quality. In this paper, we investigate the influence of four spatial PEAs (i.e. blurring, blocking, bleeding, and ringing) and two temporal PEAs (i.e. flickering and floating) on video quality. For spatial artifacts, we propose a visual saliency model with a low computational cost and higher consistency with human visual perception. In terms of temporal artifacts, self-attention based TimeSFormer is improved to detect temporal artifacts. Based on the six types of PEAs, a quality metric called Saliency-Aware Spatio-Temporal Artifacts Measurement (SSTAM) is proposed. Experimental results demonstrate that the proposed method outperforms state-of-the-art metrics. We believe that SSTAM will be beneficial for optimizing video coding techniques.
translated by 谷歌翻译
Transformer has achieved impressive successes for various computer vision tasks. However, most of existing studies require to pretrain the Transformer backbone on a large-scale labeled dataset (e.g., ImageNet) for achieving satisfactory performance, which is usually unavailable for medical images. Additionally, due to the gap between medical and natural images, the improvement generated by the ImageNet pretrained weights significantly degrades while transferring the weights to medical image processing tasks. In this paper, we propose Bootstrap Own Latent of Transformer (BOLT), a self-supervised learning approach specifically for medical image classification with the Transformer backbone. Our BOLT consists of two networks, namely online and target branches, for self-supervised representation learning. Concretely, the online network is trained to predict the target network representation of the same patch embedding tokens with a different perturbation. To maximally excavate the impact of Transformer from limited medical data, we propose an auxiliary difficulty ranking task. The Transformer is enforced to identify which branch (i.e., online/target) is processing the more difficult perturbed tokens. Overall, the Transformer endeavours itself to distill the transformation-invariant features from the perturbed tokens to simultaneously achieve difficulty measurement and maintain the consistency of self-supervised representations. The proposed BOLT is evaluated on three medical image processing tasks, i.e., skin lesion classification, knee fatigue fracture grading and diabetic retinopathy grading. The experimental results validate the superiority of our BOLT for medical image classification, compared to ImageNet pretrained weights and state-of-the-art self-supervised learning approaches.
translated by 谷歌翻译